Like with \(\hat{p}\), the difference of two sample proportions \(\hat{p}_1 - \hat{p}_2\) can be modeled using a normal distribution (when conditions are met), with standard error \(SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\).
This standard error comes from the fact that variances of independent random variables add, even when subtracting.
Differences of Two Proportions: Standard Errors
When we talk about the spread of an estimate, we’re really talking about variance (the square of the standard error).
If two random variables A and B are independent, then:
\(\text{Var}(A - B) = \text{Var}(A) + \text{Var}(B)\)
This might seem counterintuitive — but remember:
Even if you’re subtracting two noisy measurements, the uncertainty (noise) from both still adds up.
Think of it like using two shaky rulers. Subtracting doesn’t cancel the shakiness — it just combines it!
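A quick simulation makes this concrete (the means and SDs below are illustrative values, not from the slides):

```r
# Variances of independent variables add even when we take a difference
set.seed(1)
A <- rnorm(1e5, mean = 10, sd = 2)   # Var(A) = 4
B <- rnorm(1e5, mean = 3,  sd = 1)   # Var(B) = 1
var(A - B)                           # close to 4 + 1 = 5, not 4 - 1 = 3
```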
Differences of Two Proportions: Simulation Setup
We’ll use simulation to understand the sampling distribution of sample proportions for two independent groups.
Group 1 has a true proportion \(p_1 = 0.5\)
Group 2 has a true proportion \(p_2 = 0.4\)
Each group has \(n = 500\) individuals per sample
We’ll repeat this sampling 1,000 times to observe variation in the sample proportions
Differences of Two Proportions: Simulation
set.seed(1)  # ensures reproducibility
B <- 1000    # number of simulations
n <- 500     # sample size per group
p1 <- 0.5    # true proportion in group 1
p2 <- 0.4    # true proportion in group 2

# Create empty vectors to store simulated means and SDs
mean_x1 <- mean_x2 <- numeric(B)
sd_x1 <- sd_x2 <- numeric(B)

# Loop to simulate samples for both groups
for (i in 1:B) {
  # Generate random binary outcomes for group 1 (success = 1, failure = 0)
  x1 <- rbinom(n, size = 1, prob = p1)
  mean_x1[i] <- mean(x1)  # sample proportion for group 1
  sd_x1[i] <- sd(x1)      # sample SD for group 1

  # Repeat for group 2
  x2 <- rbinom(n, size = 1, prob = p2)
  mean_x2[i] <- mean(x2)
  sd_x2[i] <- sd(x2)
}
Comparing Theoretical and Empirical Standard Errors
We can compare:
the theoretical standard error (from the formula)
the empirical standard error (from our simulations)
Comparing Theoretical and Empirical Standard Errors
# Theoretical standard error for a sample proportion
sqrt(p1 * (1 - p1) / n)
[1] 0.02236068
# Empirical standard error from simulated data
mean(sd_x1) / sqrt(n)
[1] 0.0223613
The first line gives the theoretical SE: \(\sqrt{p(1-p)/n}\)
The second line gives the empirical SE, based on simulated SDs
These values should be nearly identical, validating the normal approximation for large \(n\)
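The same check extends to the difference of the two sample proportions. A minimal sketch, re-simulating with the same parameters as above, compares the theoretical SE of \(\hat{p}_1 - \hat{p}_2\) to the spread of simulated differences:

```r
# Theoretical vs simulated SE of the difference p1_hat - p2_hat
set.seed(1)
n <- 500; p1 <- 0.5; p2 <- 0.4
diffs <- replicate(1000, mean(rbinom(n, 1, p1)) - mean(rbinom(n, 1, p2)))
se_theory <- sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
c(theoretical = se_theory, empirical = sd(diffs))   # both near 0.031
```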
Sampling Distribution for One Group (p1 = 0.5)
We can visualize the distribution of sample proportions across simulations, and overlay a 95% confidence interval around the true mean.
Sampling Distribution for One Group (p1 = 0.5)
Roughly 5% of simulated sample proportions should fall outside this interval — confirming the 95% confidence level’s interpretation.
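The bounds `lb_x1` and `ub_x1` used in the next chunk are not defined in the earlier code; one plausible reconstruction (an assumption, not the slides' own code) centers a 95% interval at the true proportion using the theoretical SE:

```r
# Hypothetical reconstruction of the interval bounds for group 1
p1 <- 0.5
n  <- 500
se1   <- sqrt(p1 * (1 - p1) / n)      # theoretical SE
lb_x1 <- p1 - qnorm(0.975) * se1      # lower bound of 95% interval
ub_x1 <- p1 + qnorm(0.975) * se1      # upper bound of 95% interval
c(lb_x1, ub_x1)
```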
# Calculate the proportion of estimates that fall outside the 95% CI
mean(ifelse(mean_x1 >= ub_x1 | mean_x1 <= lb_x1, 1, 0))
[1] 0.052
Sampling Distribution for One Group (p2 = 0.4)
Again, around 5% of simulated estimates will fall outside the interval.
The spread is slightly narrower than for Group 1 because the variance is smaller.
# Share of points outside the 95% CI
mean(ifelse(mean_x2 >= ub_x2 | mean_x2 <= lb_x2, 1, 0))
[1] 0.048
Combining Two Proportions
Now that we understand the sampling variation of each group separately, we can combine them just as we would when estimating a difference in proportions:
Consider an experiment involving patients who underwent cardiopulmonary resuscitation (CPR) following a heart attack and were subsequently admitted to a hospital.
Patients were randomly assigned to either a treatment group (received a blood thinner) or a control group (no blood thinner).
The outcome of interest was survival for at least 24 hours.
Differences of two proportions: Example 1
            Survived   Died   Total
Control           11     39      50
Treatment         14     26      40
Total             25     65      90
Differences of two proportions: Example 1
Create and interpret a 90% confidence interval of the difference for the survival rates in the CPR study.
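The 90% row of the output below can be approximately reproduced from the table's counts (treatment: 14 of 40 survived; control: 11 of 50). A minimal sketch:

```r
# 90% CI for the difference in survival rates (treatment minus control)
p_hat_t <- 14 / 40   # treatment survival rate
p_hat_c <- 11 / 50   # control survival rate
point_est <- p_hat_t - p_hat_c
se <- sqrt(p_hat_t * (1 - p_hat_t) / 40 + p_hat_c * (1 - p_hat_c) / 50)
point_est + c(-1, 1) * qnorm(0.95) * se   # roughly (-0.027, 0.287)
```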
sig_level z_score min max
1 0.01 2.450 -0.10396538 0.3639654
2 0.05 1.950 -0.05621734 0.3162173
3 0.10 1.645 -0.02709104 0.2870910
Differences of two proportions: Visualizing Confidence Intervals
Differences of two proportions: Interpretation
We are 90% confident that blood thinners change the 24-hour survival rate by between -3 and 29 percentage points for patients similar to those in the study.
Because 0% is within this range, the evidence is inconclusive — we cannot determine whether blood thinners help or harm heart attack patients who have undergone CPR.
Differences of Two Proportions: Example 2
A 5-year clinical trial evaluated whether fish oil supplements reduce the risk of heart attacks.
Each participant was randomly assigned to one of two groups:
Fish Oil group
Placebo group
We’ll examine heart attack outcomes across both groups.
Differences of Two Proportions: Example 2
Group      Heart Attack   No Event    Total
Fish Oil            145     12,788   12,933
Placebo             200     12,738   12,938
Differences of Two Proportions: Example 2
Construct a 90% confidence interval for the effect of fish oil on heart attack incidence among patients represented by this study.
Interpret the interval in context:
What does the direction and width of the interval suggest?
Is there evidence that fish oil has a meaningful effect on heart attack risk?
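A sketch of the interval computation from the table's counts (fish oil: 145 of 12,933; placebo: 200 of 12,938), using a 90% confidence level to match the interpretation that follows:

```r
# 90% CI for the difference in heart-attack rates (fish oil minus placebo)
p_hat_f <- 145 / 12933
p_hat_p <- 200 / 12938
point_est <- p_hat_f - p_hat_p            # about -0.0043
se <- sqrt(p_hat_f * (1 - p_hat_f) / 12933 + p_hat_p * (1 - p_hat_p) / 12938)
point_est + c(-1, 1) * qnorm(0.95) * se   # about (-0.0066, -0.0019)
```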
Differences of two proportions: Visualizing Confidence Intervals
Differences of two proportions: Interpretation
The point estimate for the effect of fish oil is approximately –0.0043,
meaning heart attacks occurred 0.43 percentage points less often in the fish-oil group than in the placebo group.
We are 90% confident that fish oil changes the heart-attack rate by between –0.66 and –0.19 percentage points for patients similar to those in the study.
Because this interval does not include 0, the reduction in heart-attack risk is statistically significant at the 10% (and even 5% and 1%) level.
Practical vs. Statistical Significance
While statistically significant, the effect size is extremely small — roughly 0.4 fewer heart attacks per 100 individuals.
In a large clinical trial, even very small effects can reach statistical significance, because the standard error shrinks as the sample size grows.
From a practical standpoint, such a small reduction may not justify the cost, side effects, or adherence burden of treatment.
More on Two-Proportion Hypothesis Tests
When conducting a two-proportion hypothesis test, the null hypothesis is typically: \(H_0: p_1 - p_2 = 0\)
However, there are cases where we may want to test for a specific difference other than zero.
For example, suppose we want to test whether: \(H_0: p_1 - p_2 = 0.10\)
In contexts like these, we use the sample proportions \(\hat{p}_1\) and \(\hat{p}_2\) to check the success–failure condition and to construct the standard error.
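A sketch of such a test, with assumed sample values (the counts below are hypothetical, chosen only to illustrate the non-zero null):

```r
# Hypothetical data for H0: p1 - p2 = 0.10
p_hat_1 <- 0.62; n1 <- 200   # assumed sample proportion and size, group 1
p_hat_2 <- 0.45; n2 <- 200   # assumed sample proportion and size, group 2
se <- sqrt(p_hat_1 * (1 - p_hat_1) / n1 + p_hat_2 * (1 - p_hat_2) / n2)
z  <- (p_hat_1 - p_hat_2 - 0.10) / se    # compare the difference to 0.10, not 0
2 * (1 - pnorm(abs(z)))                  # two-sided p-value
```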
Differences of Two Proportions: Example 3
A drone quadcopter company is considering a new manufacturer for rotor blades.
The new manufacturer is more expensive but claims that their higher-quality blades are 3% more reliable, meaning that 3% more blades pass inspection compared to the current supplier.
sig_level z_score min max
1 0.01 2.450 0.0009547225 0.05704528
2 0.05 1.950 0.0066782486 0.05132175
3 0.10 1.645 0.0101695994 0.04783040
Visualizing Confidence Intervals
Compute and Visualize the z-Statistic
z <- (point_est - 0.03) / se
set.seed(1)
sim <- rnorm(1000, mean = 0.03, sd = se)
# Probability of observing a value this extreme or larger
1 - mean(ifelse(point_est >= sim, 1, 0))
[1] 0.004
# p-value (right-tailed)
1 - pnorm(z)
[1] 0.005648044
Visualizing the Sampling Distribution
Example 3: Conclusion
From the standard normal distribution:
The right-tail area is approximately 0.004 (from the simulation; the exact normal tail is about 0.006)
Doubling for a two-tailed test gives p = 0.008
Since p = 0.008 < 0.05, we reject the null hypothesis
We find statistically significant evidence that the higher-quality blades' pass rate exceeds the standard blades' by more than 3 percentage points — exceeding the manufacturer's claim.
Chi-Squared Distributions: Introduction
\(\chi\) is the Greek letter “chi” (pronounced like “kai”)
The \(\chi^2\) distribution is a continuous probability distribution that is widely used in statistical inference.
Closely related to the standard normal distribution
If a variable \(Z\) has the standard normal distribution, then \(Z^2\) has the \(\chi^2\) distribution with one degree of freedom
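This relationship can be checked numerically: \(P(Z^2 \le q) = P(|Z| \le \sqrt{q}) = 2\Phi(\sqrt{q}) - 1\). A minimal sketch:

```r
# The chi-squared (df = 1) CDF equals 2 * Phi(sqrt(q)) - 1
q <- c(0.5, 1, 2, 4)
chisq_cdf  <- pchisq(q, df = 1)       # P(Z^2 <= q)
normal_cdf <- 2 * pnorm(sqrt(q)) - 1  # P(|Z| <= sqrt(q))
all.equal(chisq_cdf, normal_cdf)      # TRUE
```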
Chi-Squared Distributions: Histograms
Chi-Squared Distributions: Definition
If \(Z_1, Z_2,..., Z_k\) are independent standard normal variables, then…
\(Z_1^2 + Z_2^2 +...+ Z_k^2\)
…has a \(\chi^2\) distribution with \(k\) degrees of freedom.
Degrees of Freedom: Concept
A degree of freedom (df) represents the number of independent pieces of information available to estimate something.
Whenever we calculate a statistic, we “use up” some information.
For example, once we estimate the sample mean, one data point can be perfectly predicted from the others.
So for a sample of size \(n\), only \((n - 1)\) observations are free to vary when computing the sample variance.
if \(n = 4\), \(x_1 = 8\), \(x_2 = 10\), \(x_3 = 12\) and \(\bar{x} = 10\)…
… then \(x_4 = 10\)
So even though we had four data points, only three were free to vary — the fourth is determined by the mean.
That’s why when calculating the sample variance, we divide by \(n - 1\) instead of \(n\): one degree of freedom has been “used up” in estimating the mean.
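The worked example above in code: given the mean, the last observation is pinned down by the others.

```r
# With n = 4 and xbar = 10, the fourth value is forced by the mean
x123 <- c(8, 10, 12)        # the three values free to vary
x4   <- 4 * 10 - sum(x123)  # n * xbar minus the rest
x4                          # 10
```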
Degrees of Freedom: Motivation
Why it matters:
Degrees of freedom tell us how much independent information our test or estimate is based on.
They affect the shape of sampling distributions (like \(t\), \(F\), and \(\chi^2\)), which in turn changes the critical values and p-values we use.
More degrees of freedom → more information → the distribution becomes narrower and more normal-looking.
Big idea:
Degrees of freedom link sample size, uncertainty, and the reliability of inference — they remind us that every time we estimate something, we “spend” information.
Chi-Squared Distributions: Properties
mean: \(\mu = k\)
variance: \(\sigma^2 = 2k\)
mode occurs at \(k - 2\) (equivalently \(\mu - 2\)), for \(k \geq 2\)
set.seed(1)
x <- rnorm(1000, mean = 0, sd = 1)
x2 <- x^2
mean(x2)
[1] 1.070115
var(x2)
[1] 2.291541
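The same properties can be checked for more than one degree of freedom; a sketch for \(k = 5\), where the mean, variance, and mode should be near 5, 10, and 3:

```r
# Check mean = k, variance = 2k, and mode = k - 2 for k = 5
set.seed(1)
k <- 5
x <- rchisq(1e5, df = k)
c(mean(x), var(x))                       # near 5 and 10
grid <- seq(0, 20, by = 0.01)
grid[which.max(dchisq(grid, df = k))]    # density peaks near k - 2 = 3
```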
Chi-Squared Distributions: Multiple Degrees of Freedom